home *** CD-ROM | disk | FTP | other *** search
- Path: engnews1.Eng.Sun.COM!taumet!clamage
- From: John Max Skaller <maxtal@suphys.physics.su.oz.au>
- Newsgroups: comp.std.c++
- Subject: Re: Guarantees concerning references/pointers into string
- Date: 28 Jan 1996 23:24:34 GMT
- Organization: MAXTAL
- Approved: clamage@eng.sun.com (comp.std.c++)
- Message-ID: <310BF7E7.4DDE@suphys.physics.su.oz.au>
- References: <KANZE.96Jan25184235@gabi.gabi-soft.fr>
- NNTP-Posting-Host: 129.146.88.126
- Mime-Version: 1.0
- Content-Type: text/plain; charset=us-ascii
- Content-Transfer-Encoding: 7bit
- X-Nntp-Posting-Host: slsyd4p22.ozemail.com.au
- X-Mailer: Mozilla 2.0b6a (WinNT; I)
- Content-Length: 5440
- Originator: clamage@taumet
-
- J. Kanze wrote:
-
- > a = "A lot of junk" ;
- > a.reserve( 100 ) ;
- > char* p1 = &a[ 1 ] ;
- > // b = a ;
- > a.insert( 6 , "---" ) ;
- > char* p2 = &a[ 1 ] ;
- > cout << "Reserve test: " << (p1 == p2 ? "Passed" : "Failed") << endl ;
- >
- > As I interpret the standard, the above is required to work.
-
- I agree. It should work.
-
-
- > The presense of
- > `reserve' makes copy on write anything but trivial to implement
- > correctly. (The exact case above is actually easy to avoid, and I'm
- > surprised that g++ has the error. But try uncommenting the assignment
- > in the above code, and it becomes considerably more difficult to avoid
- > the problem.)
-
- I do NOT understand. There is a problem with
- copy on write implementations, but it has nothing to do with
- reserve(). Rather, the problem occurs because of aliasing:
-
- void f(string &x, string const &y) {
- char &ch = y[0];
- x = "fred";
- }
-
- Is "ch" assured? Nope:
-
- string s = "Hi";
- f(s,s);
-
-
- Consider now:
-
- void f(string &x, string const &y) {
- char &ch = y[0];
- cout << x[0] << ch;
- }
-
- Here, ch might _still_ be invalidated even though there
- is no write operation -- merely binding a non-const
- lvalue into the string _potentially_ allows writing,
- and the string class has no choice but to do the copy
- immediately. C++ overloads on the constness of the object
- and not whether the context is an lvalue or rvalue context.
-
- > One further question occurs: when may reallocation (and the resulting
- > invalidation of the pointers) occur when reserve has not been called?
- > For example, is the following guaranteed to work as expected:
- >
- > string s ;
- > size_t i , j ;
- > assert( i < s.size() , j < s.size() ) ;
- > s[ i ] = s[ j ] ;
- >
- > I would like for the last statement to be well defined, but according to
- > my reading of the standard, it isn't. Reallocation is allowed in the
- > non-const version of operator[] (and must be, if copy on write is to be
- > a legal implementation).
-
- "must be" doesn't follow -- indexing _could_ return
- a proxy object rather than a reference, couldn't it?
-
- > This operator is called twice in the
- > expression, and the compiler could very easily (and legally) generate
- > both calls before using either of the results. But if the second call
- > reallocates, the reference returned by the first call is invalidated.
- > (I cannot actually conceive of an implementation in which the second
- > call reallocates, but I cannot find anything in the draft to guarantee
- > that it won't.)
-
- A naive COW implementation in which the "first" evaluated index operator
- is the RHS above and for which a const alias is used will
- invoke the CONST indexing operator, while evaluating
- the LHS invokes the NONCONST indexing operator and triggers
- reallocation. For example:
-
- string s = "Hello";
- string const &cs = s;
- s[0] = cs[0];
-
- is almost certain to fail in a naive COW implementation
- if the RHS indexing happens to return a lvalue instead
- of an rvalue.
-
- Here's another case:
-
- s.insert(const_iterator, "x");
-
- [No, there's nothing wrong with inserting at a const iterator.
- It is the string object before the dot (.) that must be
- non-const to support insertion, NOT the iterator, which merely
- marks the place where the insertion is to occur.]
- But calling the "insert" method might trigger a reallocation.
-
- I would not be surprised at an implemention
- which split the representation when ANY iterator
- was got from the string -- const or not. The same
- fix can be applied to indexing (requiring a mutable
- pointer inside the string object).
-
- Is there a rule for writing COW that makes it work???
-
- Yes. IF any method returns something that
- binds directly to the representation, it must
- be split BEFORE the binding is done.
-
- For example, if the indexing operator uses
- a suitable proxy, there's no problem.
- Similarly, if the iterators are pairs:
-
- (string*, index)
-
- there's no problem.
-
- My current string class MTLstring does the latter AND is a naive
- (non-COW) implementation AND allows accesses
- "off the end" -- it is highly robust, and VERY hard to break
- without doing "obvious" things (like deleting the string).
- It permits you to insert or erase in the string
- and will NOT invalidate any iterators. In fact,
- my iterators CANNOT be invalidated except by deleting
- the string. They may, of course, magically point to
- a new character after an insertion. [In fact,
- my iterators can be anchored at EITHER end of the
- string, so a "end()-1" iterator is NOT moved
- by an insertion at end()-2, because it's anchored
- to the RHS end of the string, not the LHS end.]
-
- The price is inefficiency: read/write can
- be made very efficient by grabbing a pointer
- to the raw array, but duplicate copying
- make the class unsuitable for som applications.
-
- But I'm just not interested in chasing the kind of
- horrible problems that fragile implementations have, in a class
- I personally conceive as a formatting
- and message passing utility.
-
- A reasonable compromise for a COW implementation
- to make users of the indexing operators
- and STL iterators pay for using non-OO syntax:
-
- s.put(index, chr)
-
- does not suffer from these problems, for example.
-
- --
- John Max Skaller voice: 61-2-566-2189
- 81 Glebe Point Rd fax: 61-2-660-0850
- GLEBE NSW 2037 web: http://www.maxtal.com.au/~skaller/
- AUSTRALIA email: skaller@maxtal.com.au
-
- [ comp.std.c++ is moderated. Submission address: std-c++@ncar.ucar.edu.
- Contact address: std-c++-request@ncar.ucar.edu. The moderation policy
- is summarized in http://dogbert.lbl.gov/~matt/std-c++/policy.html. ]
-
-